Based on the post: Backpropagation with TensorFlow
Load the TensorFlow library.
In [1]:
import tensorflow as tf
Import the MNIST data using TensorFlow's tutorial helper.
In [2]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
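For orientation (not in the original post), a quick look at the loaded data: each image is a flattened 28x28 pixel vector of length 784, and each label is a one-hot vector of length 10.
# Optional sanity check: shapes of the loaded data.
# With the loader's default split, mnist.train holds 55000 examples.
print(mnist.train.images.shape)  # e.g. (55000, 784)
print(mnist.train.labels.shape)  # e.g. (55000, 10)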
All data will be fed through two placeholders: a_0 will hold the input images, and y the target labels.
In [3]:
a_0 = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
Now we initialize the weights and biases with truncated normal values.
In [4]:
middle = 30
w_1 = tf.Variable(tf.truncated_normal([784, middle]))
b_1 = tf.Variable(tf.truncated_normal([1, middle]))
w_2 = tf.Variable(tf.truncated_normal([middle, 10]))
b_2 = tf.Variable(tf.truncated_normal([1, 10]))
Sigmoid function.
In [5]:
def sigma(x):
    # Elementwise logistic sigmoid: 1 / (1 + exp(-x))
    return tf.div(tf.constant(1.0),
                  tf.add(tf.constant(1.0), tf.exp(tf.neg(x))))
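In equation form, this is the standard logistic sigmoid:

\sigma(x) = \frac{1}{1 + e^{-x}}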
Forward pass. We compute z_1 and z_2 with a matrix multiplication (tf.matmul) plus the bias, and apply the sigmoid to each z to obtain the activations a_1 and a_2.
In [6]:
z_1 = tf.add(tf.matmul(a_0, w_1), b_1)
a_1 = sigma(z_1)
z_2 = tf.add(tf.matmul(a_1, w_2), b_2)
a_2 = sigma(z_2)
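Written out, the forward pass computes:

z_1 = a_0 W_1 + b_1, \quad a_1 = \sigma(z_1), \quad z_2 = a_1 W_2 + b_2, \quad a_2 = \sigma(z_2)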
Difference (subtraction) between the network output and the target vector y.
In [7]:
diff = tf.sub(a_2, y)
For backpropagation we need the derivative of the sigmoid function. This is the sigmoid prime function.
In [8]:
def sigmaprime(x):
    # Derivative of the sigmoid: sigma(x) * (1 - sigma(x))
    return tf.mul(sigma(x), tf.sub(tf.constant(1.0), sigma(x)))
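This implements the usual sigmoid derivative:

\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)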
Computation of the deltas for the weights and biases.
In [9]:
d_z_2 = tf.mul(diff, sigmaprime(z_2))
d_b_2 = d_z_2
d_w_2 = tf.matmul(tf.transpose(a_1), d_z_2)
d_a_1 = tf.matmul(d_z_2, tf.transpose(w_2))
d_z_1 = tf.mul(d_a_1, sigmaprime(z_1))
d_b_1 = d_z_1
d_w_1 = tf.matmul(tf.transpose(a_0), d_z_1)
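These lines correspond to the backpropagation equations for a quadratic cost C = \frac{1}{2}\lVert a_2 - y \rVert^2, with \odot denoting the elementwise product; the bias deltas are simply the \delta terms themselves:

\delta^{(2)} = (a_2 - y) \odot \sigma'(z_2), \qquad \nabla_{W_2} C = a_1^{T}\,\delta^{(2)}

\delta^{(1)} = \bigl(\delta^{(2)} W_2^{T}\bigr) \odot \sigma'(z_1), \qquad \nabla_{W_1} C = a_0^{T}\,\delta^{(1)}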
Update the network.
In [10]:
eta = tf.constant(0.5)
step = [
    tf.assign(w_1, tf.sub(w_1, tf.mul(eta, d_w_1))),
    tf.assign(b_1, tf.sub(b_1, tf.mul(eta,
                  tf.reduce_mean(d_b_1, reduction_indices=[0])))),
    tf.assign(w_2, tf.sub(w_2, tf.mul(eta, d_w_2))),
    tf.assign(b_2, tf.sub(b_2, tf.mul(eta,
                  tf.reduce_mean(d_b_2, reduction_indices=[0]))))
]
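In equation form, this step is plain gradient descent with learning rate \eta = 0.5; the weight gradients are summed over the batch by the matrix product, while the bias gradients are averaged over the batch with tf.reduce_mean:

W_i \leftarrow W_i - \eta\,\nabla_{W_i} C, \qquad b_i \leftarrow b_i - \eta\,\operatorname{mean}_{\text{batch}}\bigl(\delta^{(i)}\bigr)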
Train and test the network using batches of ten.
In [11]:
acct_mat = tf.equal(tf.argmax(a_2, 1), tf.argmax(y, 1))
acct_res = tf.reduce_sum(tf.cast(acct_mat, tf.float32))
sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
for i in range(10000):
    batch_xs, batch_ys = mnist.train.next_batch(10)
    sess.run(step, feed_dict={a_0: batch_xs,
                              y: batch_ys})
    if i % 1000 == 0:
        # Count correct predictions on the first 1000 test images.
        res = sess.run(acct_res,
                       feed_dict={a_0: mnist.test.images[:1000],
                                  y: mnist.test.labels[:1000]})
        print(res)
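As an optional check (a sketch, not part of the original post), the hand-derived gradients can be compared against TensorFlow's automatic differentiation: with the quadratic cost above, tf.gradients should reproduce d_w_1 and d_w_2 up to floating-point error. The snippet assumes the same pre-1.0 TensorFlow API used throughout the post.
# Sanity check: compare the manual backprop gradients with tf.gradients (autodiff).
cost = tf.mul(tf.constant(0.5), tf.reduce_sum(tf.square(diff)))
grad_w_1, grad_w_2 = tf.gradients(cost, [w_1, w_2])
batch_xs, batch_ys = mnist.train.next_batch(10)
manual_w_2, auto_w_2 = sess.run([d_w_2, grad_w_2],
                                feed_dict={a_0: batch_xs, y: batch_ys})
print(abs(manual_w_2 - auto_w_2).max())  # should be close to 0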